Goto

Collaborating Authors

 information field


Universal Reinforcement Learning in Coalgebras: Asynchronous Stochastic Computation via Conduction

Mahadevan, Sridhar

arXiv.org Artificial Intelligence

In this paper, we introduce a categorial generalization of RL, termed universal reinforcement learning (URL), building on powerful mathematical abstractions from the study of coinduction on non-well-founded sets and universal coalgebras, topos theory, and categorial models of asynchronous parallel distributed computation. In the first half of the paper, we review the basic RL framework, illustrate the use of categories and functors in RL, showing how they lead to interesting insights. In particular, we also introduce a standard model of asynchronous distributed minimization proposed by Bertsekas and Tsitsiklis, and describe the relationship between metric coinduction and their proof of the Asynchronous Convergence Theorem. The space of algorithms for MDPs or PSRs can be modeled as a functor category, where the co-domain category forms a topos, which admits all (co)limits, possesses a subobject classifier, and has exponential objects. In the second half of the paper, we move on to universal coalgebras. Dynamical system models, such as Markov decision processes (MDPs), partially observed MDPs (POMDPs), a predictive state representation (PSRs), and linear dynamical systems (LDSs) are all special types of coalgebras. We describe a broad family of universal coalgebras, extending the dynamic system models studied previously in RL. The core problem in finding fixed points in RL to determine the exact or approximate (action) value function is generalized in URL to determining the final coalgebra asynchronously in a parallel distributed manner.


Universal Decision Models

Mahadevan, Sridhar

arXiv.org Artificial Intelligence

Humans are universal decision makers: we reason causally to understand the world; we act competitively to gain advantage in commerce, games, and war; and we are able to learn to make better decisions through trial and error. In this paper, we propose Universal Decision Model (UDM), a mathematical formalism based on category theory. Decision objects in a UDM correspond to instances of decision tasks, ranging from causal models and dynamical systems such as Markov decision processes and predictive state representations, to network multiplayer games and Witsenhausen's intrinsic models, which generalizes all these previous formalisms. A UDM is a category of objects, which include decision objects, observation objects, and solution objects. Bisimulation morphisms map between decision objects that capture structure-preserving abstractions. We formulate universal properties of UDMs, including information integration, decision solvability, and hierarchical abstraction. We describe universal functorial representations of UDMs, and propose an algorithm for computing the minimal object in a UDM using algebraic topology. We sketch out an application of UDMs to causal inference in network economics, using a complex multiplayer producer-consumer two-sided marketplace.


Causal Homotopy

Mahadevan, Sridhar

arXiv.org Artificial Intelligence

We characterize homotopical equivalences between causal DAG models, exploiting the close connections between partially ordered set representations of DAGs (posets) and finite Alexandroff topologies. Alexandroff spaces yield a directional topological space: the topology is defined by a unique minimal basis defined by an open set for each variable x, specified as the intersection of all open sets containing x. Alexandroff spaces induce a (reflexive, transitive) preorder. Alexandroff spaces satisfying the Kolmogorov T0 separation criterion, where open sets distinguish variables, converts the preordering into a partial ordering. Our approach broadly is to construct a topological representation of posets from data, and then use the poset representation to build a conventional DAG causal model. We illustrate our framework by showing how it unifies disparate algorithms and case studies proposed previously. Topology plays two key roles in causal discovery. First, topological separability constraints on datasets have been used in several previous approaches to infer causal structure from observations and interventions. Second, a diverse range ofgraphical models used to represent causal structures can be represented in a unified way in terms of a topological representation of the induced poset structure. We show that the homotopy theory of Alexandroff spaces can be exploited to significantly efficiently reduce the number of possible DAG structures, reducing the search space by several orders of magnitude.